*** LIS Cross-National Data Center in Luxembourg

* email: usersupport@lisdatacenter.org 

*** LIS Self Teaching Package 2022

*** Part I: Inequality, poverty, and social policy
*** Stata version

* last change of this version of the syntax: 15-01-2022.

** Exercise 4: Inequality: The Gini Index

use dhi hifactor hpub_i hpub_u hpub_a hiprivate hxitsc hpopwgt nhhmem grossnet using $gt06h, clear
* flag and drop records where dhi or any of its income components is missing
gen miss_comp = 0
quietly replace miss_comp=1 if dhi==. | hifactor==. | hpub_i==. | hpub_u ==. | hpub_a == . | hiprivate==. | hxitsc==.
quietly drop if miss_comp==1
* keep only records with non-missing dhi (already ensured by the drop above; kept as a safeguard)
drop if dhi==. 
* recode negative dhi to zero
gen dhi_tb=dhi
replace dhi_tb=0 if dhi<0

* Apply top and bottom codes / outlier detection
gen dhi_log=log(dhi_tb) 
* log(0) is missing: set dhi_log to 0 for zero dhi (including recoded negatives) so these records stay in the distribution of non-missing dhi
replace dhi_log=0 if dhi_log==. & dhi_tb!=.  
* compute the weighted quartiles of log income (the interquartile range is p75 - p25)
quietly summarize dhi_log [aw=hpopwgt], detail
gen iqr=r(p75)-r(p25) 
* compute upper and lower bounds for extreme values (3 x IQR beyond the quartiles, on the log scale)
gen upper_bound=r(p75) + (iqr * 3) 
gen lower_bound=r(p25) - (iqr * 3) 
* top code income at upper bound for extreme values 
replace dhi_tb=exp(upper_bound) if dhi_tb>exp(upper_bound)  
* bottom code income at lower bound for extreme values 
replace dhi_tb=exp(lower_bound) if dhi_tb<exp(lower_bound)  
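
* Optional check (a sketch, not part of the original exercise): show the
* bounds in currency units and count how many records the coding affects.
* It compares the untouched dhi with the bounds, so the counts may differ
* marginally from the replacements above because of float rounding.
display "lower bound: " exp(lower_bound[1]) "   upper bound: " exp(upper_bound[1])
count if dhi > exp(upper_bound)
count if dhi < exp(lower_bound)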

* Apply the LIS equivalence scale (square root of household size)
gen edhi_tb = dhi_tb/(nhhmem^0.5)
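
* Worked example of the square-root scale (illustrative figures, not LIS data):
* a four-person household with dhi of 40000 has an equivalised income of
* 40000/(4^0.5) = 40000/2 = 20000, whereas a single person keeps the full 40000.
display 40000/(4^0.5)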

* Generate per capita dhi (top and bottom coded)  
gen pcdhi_tb = dhi_tb/nhhmem

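* Compute the Gini coefficient for the three versions of household income:
* total dhi weighted by households; per capita and equivalised dhi weighted by persons (hpopwgt*nhhmem)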
ineqdec0 dhi_tb [w=hpopwgt]
ineqdec0 pcdhi_tb [w=hpopwgt*nhhmem]
ineqdec0 edhi_tb [w=hpopwgt*nhhmem]
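
* Optional sketch: reuse the Gini instead of reading it off the output table.
* This assumes ineqdec0 leaves the coefficient in r(gini); run "return list"
* after the command to confirm this in your installation.
quietly ineqdec0 edhi_tb [w=hpopwgt*nhhmem]
display "Gini, equivalised dhi: " %6.4f r(gini)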
